OPTIMAL HIGH-SPEED MULTI-RESOLUTION RETRIEVAL METHOD ON LARGE
CAPACITY DATABASE

BACKGROUND OF THE INVENTION 


Field of the Invention 

The present invention relates to an optimal high-speed
multi-resolution retrieval method on a large capacity database,
and more particularly to a technique for deriving an inequality
that allows an accurate and rapid retrieval of desired
information from a database, and implementing an optimal
high-speed information retrieval using the derived inequality.



Description of the Related Art 

In order to search for the best match to a query based on
a similarity measure, an exhaustive search must be performed
over all data contained in a database. However,
straightforward exhaustive search algorithms require a large
quantity of calculation. Thus, a variety of high-speed search
algorithms have recently been proposed.

Berman and Shapiro have proposed the introduction of a
triangular inequality so as to remove, from a retrieval
procedure, candidates having no possibility of being determined
as the best match(es). To reduce the additional calculation
quantity, they have also proposed the simultaneous use of diverse
distance measures and representative data called "key data".
However, the retrieval speed of this method varies considerably
depending on the "key data", and its speed performance is
insufficient for large capacity databases.

Recently, Berman and Shapiro have also proposed the
application of a data structure called a "Triangle Trie" to
achieve an improvement in performance. In this method, however,
there is still a problem in that the retrieval speed is
considerably influenced by the tree depth and the threshold value
of the "key data".

Meanwhile, Krishnamachari and Mottaleb have proposed a
cluster-based indexing algorithm in which diverse data
contained in a database are partitioned into clusters in such a
fashion that each cluster contains data having similar
features, in accordance with an architectural clustering
scheme.

In accordance with the cluster-based indexing algorithm,
it is possible to remarkably reduce the quantity of calculation
because query data is not compared with all data contained in a
database, but only with a part of the data in a retrieval
procedure in accordance with the clustering scheme.

In particular, the cluster-based indexing algorithm is
suitable for large capacity databases in that the number of
comparisons needed to obtain a desired retrieval accuracy is not
linearly proportional to the capacity of the database.

Fig. 1 is a schematic diagram illustrating problems
involved in conventional cluster-based search algorithms.
Referring to Fig. 1, the second cluster is selected as a
candidate because its center $C_2$ is nearest to the query $Q$. In
accordance with the illustrated search algorithm, an element $X_2$
in the second cluster is selected as the best match, based on
the distance of each element belonging to the second cluster
from the query $Q$. However, the actual best match is the element
$X_8$ of the first cluster.

Such a problem occurs because the center of the cluster to
which the actual best match belongs is not always the center
nearest to the query $Q$. To address this, a method for
simultaneously searching several nearby clusters has been
proposed. However, this method inherently cannot ensure an
optimal retrieval.

Also, the conventional cluster-based search algorithms,
which cannot ensure an optimal retrieval, have a drawback in
that their retrieval speed is not sufficiently rapid when a
satisfactory retrieval accuracy is to be obtained.

SUMMARY OF THE INVENTION

Therefore, the present invention has been made to solve
the above mentioned problems, and an object of the invention is
to provide an optimal high-speed multi-resolution retrieval
method on a large capacity database for deriving an inequality
capable of accurately distinguishing effective clusters from
ineffective clusters, and implementing an optimal high-speed
information retrieval using the derived inequality.

Another object of the invention is to provide an optimal
high-speed multi-resolution retrieval method on a large
capacity database for deriving an inequality based on a
multi-resolution data structure for high-speed processing, and
implementing an optimal high-speed multi-resolution retrieval
using the derived inequality.

In accordance with the present invention, these objects
are accomplished by providing an optimal high-speed
multi-resolution retrieval method on a large capacity database,
comprising the steps of: partitioning all data contained in a
database into a desired number of clusters each composed of
data having similar features; deriving the lower bound of the
distance between each cluster and a query, removing clusters
having no possibility of containing the best matches, and
searching, for best matches, data in clusters having the
possibility of containing the best matches; and deriving
an inequality property based on a multi-resolution data
structure for reducing unnecessary feature matching computation
involved in a search procedure, thereby reducing the quantity of
calculation.

In accordance with this method, it is possible to 
accurately search not only for a single best match, but also 
for a plurality of more-significant best matches. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The above objects, and other characteristics and 
advantages of the present invention will become more apparent 
after a reading of the following detailed description when 
taken in conjunction with the drawings, in which: 

Fig. 1 is a schematic diagram illustrating problems 
involved in conventional cluster-based search algorithms; 

Fig. 2 is a schematic diagram illustrating the distance
inequality property for an arbitrary cluster in accordance with
the present invention;

Fig. 3 is a schematic diagram illustrating the
multi-resolution data structure of a brightness histogram $X$
having $2^L$ bins;

Fig. 4 is a diagram schematically illustrating a minimum 
distance arrangement of M more-significant best matches; and 

Fig. 5 is a schematic diagram illustrating an example in 
which an erroneous determination for best matches is made. 



DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Now, the configuration and effects of a preferred
embodiment of the present invention will be described in detail
with reference to the annexed drawings.

Fig. 2 is a schematic diagram illustrating the distance
inequality property for an arbitrary cluster in accordance with
the present invention.

Prior to a description of the distance inequality
property for an arbitrary cluster in accordance with the present
invention, the clustering procedure for a database will be
described.

In accordance with the clustering procedure, the database
is first divided into a predetermined number of clusters, that
is, $K$ clusters, using the MacQueen K-means clustering method, so
that data having similar features compose one cluster.

Here, the image data features may include information
such as color, texture, and shape. In the case of audio data,
information such as pitch may be usable for data features. Each
of the $K$ clusters has its own mean center.

Since the computation required in the clustering
procedure is conducted independently of an actual retrieval, the
time taken to conduct the clustering procedure is not included
in the retrieval time. The clustering for the database is
carried out as follows:

Step 1) The number of clusters, $K$ ($K < N$), is determined.

Step 2) The features of the cluster centers, $C_1, C_2, \ldots,
C_K$, are initialized. $K$ data are arbitrarily selected from
the database as the initial cluster centers. For an efficient
initialization, the distance between any two initial cluster
centers is required to be no less than a certain threshold value.

Step 3) For each of the data other than the data selected as the
cluster centers, the nearest cluster center is determined, and
the data is included in the cluster associated with that center.
Thereafter, the center of that cluster is updated, based on the
following expression:

[Expression 1]

$$C_k \leftarrow \frac{n(\Phi_k)\, C_k + X_i}{n(\Phi_k) + 1}$$

where $X_i$ represents the current element to be added to
the current cluster, $\Phi_k$ represents the current cluster, and
$n(\Phi_k)$ represents the number of elements belonging to the
current cluster $\Phi_k$.

Step 4) The third step is repeated for all elements.
Thus, a cluster center set $\Pi^0 = \{C_1, C_2, \ldots, C_K\}$ is
finally derived.
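
By way of illustration only, the clustering steps above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `l1`, `macqueen_kmeans`, `min_sep` and `seed` are hypothetical, features are plain lists of numbers, and the L1-distance is assumed as the distance measure.

```python
import random


def l1(a, b):
    """L1-distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def macqueen_kmeans(data, k, min_sep=0.0, seed=0):
    """Single-pass MacQueen clustering; returns (centers, members).

    members[j] lists the indices of the data assigned to cluster j.
    min_sep is a hypothetical threshold on the distance between
    any two initial centers (Step 2).
    """
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)
    # Step 2: pick K initial centers that are at least min_sep apart.
    seeds = []
    for idx in order:
        if all(l1(data[idx], data[s]) >= min_sep for s in seeds):
            seeds.append(idx)
        if len(seeds) == k:
            break
    if len(seeds) < k:
        raise ValueError("min_sep is too large to pick k initial centers")
    centers = [list(data[s]) for s in seeds]
    members = [[s] for s in seeds]
    chosen = set(seeds)
    # Steps 3-4: assign every remaining element to its nearest center
    # and update that center with the running-mean rule of Expression 1.
    for idx in range(len(data)):
        if idx in chosen:
            continue
        j = min(range(k), key=lambda c: l1(data[idx], centers[c]))
        n = len(members[j])
        centers[j] = [(n * c + x) / (n + 1)
                      for c, x in zip(centers[j], data[idx])]
        members[j].append(idx)
    return centers, members
```

Because each update keeps the center equal to the mean of the elements assigned so far, the returned centers are the mean centers of their clusters.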
Now, a scheme for solving the problems inherent in the
conventional cluster-based search algorithm shown in
Fig. 1 will be described with reference to Fig. 2.

First, an initial minimum distance is derived for the
cluster nearest to a query $Q$, as expressed by the following
expression:

[Expression 2]

$$d_{\min} = \min_{X_i \in \Phi_{k_{\min}}} d(X_i, Q)$$

The nearest cluster center $C_{k_{\min}}$ used in Expression 2 is
given by the following expression:

[Expression 3]

$$C_{k_{\min}} = \arg\min_{C_k \in \Pi^0} d(C_k, Q)$$

In Expressions 2 and 3, $d(X, Y)$ represents the $L_1$-distance
between two features $X$ and $Y$. In accordance with the
conventional algorithm shown in Fig. 1, the initial minimum
distance $d_{\min}$ corresponds to $d(X_2, Q)$. Thereafter, the
element furthest from the cluster center is determined in each of
the clusters other than the cluster associated with the initial
minimum distance $d_{\min}$. The distance of the furthest element
from the cluster center in each cluster, $\delta_k$, is defined as
follows:

[Expression 4]

$$\delta_k = \max_{X_i \in \Phi_k} d(X_i, C_k)$$

The distance $\delta_k$ of each cluster is calculated and stored
in advance. Based on the initial minimum distance $d_{\min}$ and
$\delta_k$ of each cluster, it is then determined whether or not
the current cluster needs to be searched in order
to achieve an optimal retrieval. For this determination,
Inequality Property 1 expressed by the following expression is
used:

[Expression 5]

If $d(C_k, Q) - \delta_k > d_{\min}$, then
$\min_{X_i \in \Phi_k} d(X_i, Q) > d_{\min}$.

Expression 5 expressing Inequality Property 1 can be
proved as follows:

[Expression 6]

$$X_{i_{\min}} = \arg\min_{X_i \in \Phi_k} d(X_i, Q)$$

In accordance with the triangular inequality, the
following inequality holds for the element defined in
Expression 6:

[Expression 7]

$$d(X_{i_{\min}}, Q) \ge d(C_k, Q) - d(X_{i_{\min}}, C_k)$$

Similarly, the following inequality can be induced from
Expression 4:

[Expression 8]

$$\delta_k = \max_{X_i \in \Phi_k} d(X_i, C_k) \ge d(X_{i_{\min}}, C_k)$$

Using Expressions 7 and 8, the following inequality can
be obtained:

[Expression 9]

$$d(X_{i_{\min}}, Q) \ge d(C_k, Q) - d(X_{i_{\min}}, C_k) \ge d(C_k, Q) - \delta_k$$

If $d(C_k, Q) - \delta_k > d_{\min}$, the following expression is
then established:

[Expression 10]

$$d(X_{i_{\min}}, Q) = \min_{X_i \in \Phi_k} d(X_i, Q) > d_{\min}$$

Thus, Expression 5 is proved.

The quantity $d(C_k, Q) - \delta_k$ in Inequality Property 1 is
the lower bound of the distance between the query $Q$ and any
element in the current cluster $\Phi_k$.

If $d(C_k, Q) - \delta_k$ is more than $d_{\min}$, this means that
there is no element spaced apart from the query $Q$ by a distance
less than $d_{\min}$ in the current cluster $\Phi_k$. Accordingly,
it is unnecessary to take the current cluster into consideration.

In such a fashion, therefore, all ineffective clusters can be
reliably removed by applying Inequality Property 1.
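
The cluster removal condition of Inequality Property 1 may be sketched as follows. This is an illustrative sketch only: the function name `search_with_cluster_pruning` and the list-of-lists data layout are assumptions, and $\delta_k$ is computed inside the function for brevity although the text stores it in advance.

```python
def l1(a, b):
    """L1-distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search_with_cluster_pruning(query, centers, clusters):
    """Exact best-match search using Inequality Property 1.

    clusters[k] is the list of feature vectors in cluster k.
    Returns (d_min, best_vector, number_of_skipped_clusters).
    """
    # delta_k: distance of the furthest element from its cluster center.
    delta = [max(l1(x, c) for x in members)
             for c, members in zip(centers, clusters)]
    center_dist = [l1(c, query) for c in centers]
    # Initial d_min comes from the cluster with the nearest center.
    k_min = min(range(len(centers)), key=center_dist.__getitem__)
    best = min(clusters[k_min], key=lambda x: l1(x, query))
    d_min = l1(best, query)
    skipped = 0
    for k, members in enumerate(clusters):
        if k == k_min:
            continue
        # Lower bound on the distance of any element of cluster k.
        if center_dist[k] - delta[k] > d_min:
            skipped += 1
            continue
        for x in members:
            d = l1(x, query)
            if d < d_min:
                d_min, best = d, x
    return d_min, best, skipped
```

In a situation like that of Fig. 1, the cluster with the nearest center supplies only the initial $d_{\min}$; a cluster whose lower bound does not exceed $d_{\min}$ is still searched, so the actual best match is not lost.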

However, the procedure for determining the nearest
cluster and the procedure for determining the best match still
require a considerable quantity of calculation. In order to
reduce this calculation quantity, an optimal retrieval method
is proposed in accordance with the present invention. According
to this optimal retrieval method, another inequality property
is induced, based on a multi-resolution data structure. Based
on this inequality property, an optimal high-speed retrieval
can be achieved.

Fig. 3 is a schematic diagram illustrating the
multi-resolution data structure of a brightness histogram $X$
having $2^L$ bins.

For the convenience of description, it is assumed that
the multi-resolution data structure illustrated in Fig. 3 is
associated with a normalized histogram having $B$ ($B = 2^L$) bins.
The multi-resolution data structure of the histogram $X$ may be
defined by a histogram sequence $\{X^0, X^1, \ldots, X^L\}$.
Here, $X$ corresponds to $X^L$. The histogram $X^l$ has $2^l$ bins.
This histogram $X^l$ is obtained by reducing the resolution of the
histogram $X^{l+1}$ by 50% (1/2).

Each bin value of the histogram at the current level is
obtained by summing together the values of two adjacent bins in
the histogram at the next higher level. For example, assuming
that $X^l(m)$ represents the value of the m-th bin in the
histogram $X^l$, this bin value $X^l(m)$ can be derived as follows:

[Expression 11]

$$X^l(m) = X^{l+1}(2m - 1) + X^{l+1}(2m), \quad 1 \le m \le 2^l$$
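
As an illustration, the multi-resolution structure of Expression 11 may be built as sketched below. The function name `build_pyramid` is hypothetical, and the sketch uses 0-based bin indices, so bins $2m$ and $2m+1$ of the finer level play the role of bins $2m-1$ and $2m$ in Expression 11.

```python
def build_pyramid(hist):
    """Return [X^0, X^1, ..., X^L] for a histogram with 2**L bins."""
    n = len(hist)
    if n == 0 or n & (n - 1):
        raise ValueError("the number of bins must be a power of two")
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        # Expression 11: each coarse bin is the sum of two adjacent fine bins.
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    levels.reverse()  # index l of the result now holds the level-l histogram
    return levels


print(build_pyramid([1, 2, 3, 4]))  # [[10], [3, 7], [1, 2, 3, 4]]
```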

Now, another inequality property, that is, Inequality
Property 2, used for an optimal high-speed retrieval in a
multi-resolution feature space will be described in detail.
Inequality Property 2 can be expressed by the following
expression:

[Expression 12]

$$d(X, Y) = d^L(X, Y) \ge d^{L-1}(X, Y) \ge \cdots \ge d^l(X, Y) \ge \cdots \ge d^1(X, Y) \ge d^0(X, Y)$$

where $d^l(X, Y)$ represents the $L_1$-distance between two
histograms $X$ and $Y$ at level $l$, that is, $d(X^l, Y^l)$.

Expression 12 expressing Inequality Property 2 can be
proved as follows:

The $L_1$-distance $d^{l+1}(X, Y)$ between two histograms $X$ and
$Y$ at level $l+1$ can be derived as follows:

[Expression 13]

$$d^{l+1}(X, Y) = \sum_{m=1}^{2^{l+1}} \left| X^{l+1}(m) - Y^{l+1}(m) \right|$$

where each of the histograms $X^{l+1}$ and $Y^{l+1}$ has $2^{l+1}$
bins, and $X^{l+1}(m)$ represents the value of the m-th bin in the
histogram $X^{l+1}$.

Meanwhile, the $L_1$-distance $d^l(X, Y)$ at level $l$ can be
expressed as follows:

[Expression 14]

$$d^l(X, Y) = \sum_{m=1}^{2^l} \left| X^l(m) - Y^l(m) \right|$$

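Since $|A| + |B| \ge |A + B|$, the following expression is
established:

[Expression 15]

$$\sum_{m=1}^{2^l} \left( \left| X^{l+1}(2m-1) - Y^{l+1}(2m-1) \right| + \left| X^{l+1}(2m) - Y^{l+1}(2m) \right| \right)
\ge \sum_{m=1}^{2^l} \left| X^{l+1}(2m-1) - Y^{l+1}(2m-1) + X^{l+1}(2m) - Y^{l+1}(2m) \right|$$

Based on Expressions 13, 14 and 15, together with Expression 11,
the following expression can be induced:

[Expression 16]

$$d^{l+1}(X, Y) = \sum_{m=1}^{2^l} \left( \left| X^{l+1}(2m-1) - Y^{l+1}(2m-1) \right| + \left| X^{l+1}(2m) - Y^{l+1}(2m) \right| \right)
\ge \sum_{m=1}^{2^l} \left| X^{l+1}(2m-1) - Y^{l+1}(2m-1) + X^{l+1}(2m) - Y^{l+1}(2m) \right| = d^l(X, Y)$$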

Referring to Expression 16, it can be found that
Expression 12 is established. Thus, Inequality Property 2 is
proved.

Inequality Property 2 means that when $d^l(X, Y)$ is more
than a particular value, $d^L(X, Y)$ is always more than the
particular value.
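
Inequality Property 2 can be checked numerically with a short sketch. The names `build_pyramid` and `level_distances` are hypothetical, and the two 4-bin histograms are arbitrary example values, assumed to have a power-of-two number of bins.

```python
def build_pyramid(hist):
    """Return [X^0, ..., X^L]; assumes len(hist) is a power of two."""
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    return levels[::-1]


def level_distances(x, y):
    """Return [d^0(X, Y), d^1(X, Y), ..., d^L(X, Y)]."""
    return [sum(abs(a - b) for a, b in zip(xl, yl))
            for xl, yl in zip(build_pyramid(x), build_pyramid(y))]


dists = level_distances([4, 0, 1, 3], [0, 2, 5, 1])
print(dists)  # [0, 4, 12]
# The distances never decrease from a coarser level to a finer one.
assert all(lo <= hi for lo, hi in zip(dists, dists[1:]))
```

In this example a threshold of 3 would already reject the candidate at level 1, without the full-resolution distance of 12 ever being computed.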

The distance calculation at an upper level requires an
increased computation quantity as compared to that at a lower
level. Also, it is possible to remove an increased number of
ineffective candidates at the lower level, as compared to the
upper level. When such properties are applied to a search
procedure, it is possible to considerably reduce the
calculation quantity required for the search procedure.

It is assumed that $N$ represents the number of data
contained in a database $I$ ($I = \{I_1, \ldots, I_i, \ldots, I_N\}$),
and $\Omega^0$ represents the set of the features of the data
($\Omega^0 = \{X_1, \ldots, X_i, \ldots, X_N\}$). The
multi-resolution features of each data are calculated and stored
in advance.

The high-speed multi-resolution exhaustive search
algorithm (MSA_S) based on Inequality Property 2 can be
summarized as follows:



Step 1) The multi-resolution structure of a query $Q$ is
derived.

Step 2) The initial minimum distance $d_{\min}$ is set to be
infinite.

Step 3) Respective values of $i$ and $l$ are set to be 1.

Step 4) If $l = L$, the procedure of the algorithm then
proceeds to step 6. If $i$ is more than $N$, the procedure then
proceeds to step 7.

Step 5) The value of $d^l(X_i, Q)$ is derived. If $d^l(X_i, Q)$
is more than $d_{\min}$, the current candidate $X_i$ is then
removed, and respective values of $i$ and $l$ are updated with
$i + 1$ and 1. If not, the value of $l$ is updated with $l + 1$.
Subsequently, the procedure returns to step 4.

Step 6) If $d^L(X_i, Q)$ is more than $d_{\min}$, the current
candidate $X_i$ is then removed. If not, $d_{\min}$ is updated with
$d^L(X_i, Q)$. Respective values of $i$ and $l$ are updated with
$i + 1$ and 1. Thereafter, the procedure returns to step 4.

Step 7) The data having the final $d_{\min}$ is selected as the
best match.
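
The steps above may be sketched as a pair of nested loops. This is a hedged illustration rather than the claimed procedure itself: the names `msa_s`, `build_pyramid` and `l1` are hypothetical, the pyramids of the database entries are assumed to be precomputed, and at least two bins ($L \ge 1$) are assumed.

```python
INF = float("inf")


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def build_pyramid(hist):
    """Return [X^0, ..., X^L]; assumes len(hist) is a power of two."""
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    return levels[::-1]


def msa_s(query, pyramids, d_min=INF):
    """Single-best-match search over precomputed pyramids.

    Returns (best_index, d_min, full_evals), where full_evals counts
    the candidates whose full-resolution distance had to be computed.
    """
    q = build_pyramid(query)              # Step 1
    top = len(q) - 1                      # finest level L
    if top < 1:
        raise ValueError("at least two bins are assumed")
    best, full_evals = None, 0
    for i, p in enumerate(pyramids):      # candidate X_i
        for level in range(1, top + 1):   # Steps 4-6: coarse to fine
            if level == top:
                full_evals += 1
            d = l1(p[level], q[level])
            if d > d_min:
                break                     # candidate removed early
        else:
            d_min, best = d, i            # survived every level
    return best, d_min, full_evals


db = [[4, 0, 0, 0], [0, 4, 0, 0], [0, 0, 4, 0], [1, 1, 1, 1]]
print(msa_s([0, 0, 3, 1], [build_pyramid(h) for h in db]))  # (2, 2, 3)
```

In this small example the last candidate is rejected at level 1, so only three of the four full-resolution distances are computed.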
As apparent from the above description, the
multi-resolution features of each data contained in the database
are calculated and stored in advance. However, the quantity
of calculation for the multi-resolution features of query data
should be taken into consideration because those
multi-resolution features must be obtained within the search
time.

In the case of, for example, normalized brightness
histogram features with 256 bins, only 254 additions are required
to obtain the multi-resolution histogram because the
number of levels is 8. Accordingly, the quantity of calculation
for the multi-resolution brightness histogram is
negligible, considering that 511
additions and 256 absolute value computations are required for
one matching procedure.
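
The operation counts quoted above can be reproduced with a small calculation. The sketch assumes a 256-bin histogram, assumes that the pyramid is built down to level 1 only, and counts each subtraction in the matching as one addition; the function names are hypothetical.

```python
def pyramid_additions(num_bins, lowest_level=1):
    """Additions needed to build all levels from L-1 down to lowest_level."""
    total = 0
    bins = num_bins // 2          # level L-1 has half the bins of level L
    while bins >= 2 ** lowest_level:
        total += bins             # one addition per bin of the coarser level
        bins //= 2
    return total


def match_operations(num_bins):
    """(additions, absolute values) for one full-resolution L1 matching."""
    subtractions = num_bins       # one difference per bin
    accumulations = num_bins - 1  # summing num_bins absolute differences
    return subtractions + accumulations, num_bins


print(pyramid_additions(256))   # 254
print(match_operations(256))    # (511, 256)
```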

Meanwhile, although it is necessary to use an additional 
memory for storing multi-resolution histograms, such a memory 
addition may be negligible because the size of each histogram 
is considerably smaller than that of associated data. The same 
conditions are applied to other features. 

Now, a new cluster-based multi-resolution search algorithm
(CMSA) for achieving an optimal high-speed information
retrieval, using the optimal cluster removal condition
according to Inequality Property 1 and the MSA_S based on
Inequality Property 2, will be described.



Once a query is given, the cluster center nearest to the
query is first determined in accordance with MSA_S. The distance
$d_{\min}$ between the query and the initial best match of the
cluster associated with the nearest cluster center is then
derived.

Thereafter, the best match(es) are determined by applying
MSA_S to the clusters expected to contain those best matches, in
accordance with the cluster removal condition based on
Inequality Property 1. Since the determination of the nearest
cluster center is carried out based on MSA_S, the value of
$d^L(C_k, Q)$ has not been calculated for every cluster center
when the cluster removal procedure is reached.

This is because if $d^{l_k}(C_k, Q)$ is more than $d_{\min}$, the
distances $d^{l_k+1}(C_k, Q), \ldots, d^L(C_k, Q)$ at respective
levels higher than the $l_k$-th level are not calculated.

For this reason, there is a problem in that it is
necessary to calculate again the values of $d^L(C_k, Q)$ in order
to apply Inequality Property 1, expressed by Expression
5, in the cluster removal procedure.

To address this, Inequality Property 1 is modified into
Inequality Property 1.1 using the relation
$d(C_k, Q) = d^L(C_k, Q) \ge d^{l_k}(C_k, Q)$, as follows:

[Expression 17]

If $d^{l_k}(C_k, Q) - \delta_k > d_{\min}$, then
$\min_{X_i \in \Phi_k} d(X_i, Q) > d_{\min}$,
where $l_k \le L$.

In accordance with Inequality Property 1.1, where
$d^{l_k}(C_k, Q) - \delta_k$ is more than $d_{\min}$, it is
possible to remove the current cluster $\Phi_k$ without any loss.

On the other hand, where $d^{l_k}(C_k, Q) - \delta_k$ is not more
than $d_{\min}$, the current cluster $\Phi_k$ is searched because
the best match may be present in the current cluster $\Phi_k$. For
this determination, there is no additional quantity of calculation
because the values $d^{l_k}(C_k, Q)$ and $\delta_k$ associated
with each cluster are known in advance.
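
The step from Inequality Property 1 to Inequality Property 1.1 can be written out as a single chain. The following is only a restatement of the triangular inequality, Expression 4, Expression 12 and the premise of Expression 17 in the notation used above, where $l_k$ denotes the level at which the center search stopped for cluster $k$:

```latex
\begin{align*}
d(X_i, Q) &\ge d(C_k, Q) - d(X_i, C_k)
            && \text{(triangular inequality, any } X_i \in \Phi_k\text{)} \\
          &\ge d(C_k, Q) - \delta_k
            && \text{(Expression 4)} \\
          &=   d^{L}(C_k, Q) - \delta_k \\
          &\ge d^{l_k}(C_k, Q) - \delta_k
            && \text{(Expression 12, } l_k \le L\text{)} \\
          &>   d_{\min}
            && \text{(premise of Expression 17)}
\end{align*}
```

Every element of $\Phi_k$ is therefore farther from the query than the current best match, which is why the partially computed center distance suffices for the removal test.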

Two CMSAs based on the above mentioned inequality
properties are proposed according to the number of output best
matches.

The first algorithm is a CMSA_S adapted to output a single
best match, and the second algorithm is a CMSA_M adapted to
output a plurality of best matches.

The CMSA_S mainly involves three processing steps. In
accordance with this CMSA_S, $C_{k_{\min}}$ is first determined
using MSA_S. The initial minimum distance $d_{\min}$ is then
derived from $\Phi_{k_{\min}}$. Finally, MSA_S is applied to
candidate clusters selected in accordance with Inequality
Property 1.1, thereby determining the best match. The search
procedure according to CMSA_S can be summarized as follows:

Step 1) MSA_S is carried out to determine the cluster $k_{\min}$
having a minimum distance $d'_{\min}$.

Step 2) MSA_S is applied to $\Phi_{k_{\min}}$ under the condition
that the initial $d_{\min}$ corresponds to $d'_{\min}$, thereby
updating $d_{\min}$ as follows:

[Expression 18]

$$d_{\min} = \min_{X_i \in \Phi_{k_{\min}}} d^L(X_i, Q)$$

Step 3-1) $k$ is set to 1.

Step 3-2) If $k = k_{\min}$, $k$ is updated with $k + 1$. On the
other hand, if $k > K$, the procedure proceeds to step 3-4.

Step 3-3) If $d^{l_k}(C_k, Q) - \delta_k$ is more than $d_{\min}$,
the current cluster is removed. If not, $d_{\min}$ is updated by
applying MSA_S to $\Phi_k$. After $k$ is updated with $k + 1$, the
procedure returns to step 3-2.

Step 3-4) The data having the final $d_{\min}$ is selected as the
best match.
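
The CMSA_S procedure may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: all function names are hypothetical, the pyramids and $\delta_k$ are built inside the function although the text precomputes them, at least two bins are assumed, and the search of the nearest cluster starts from an infinite threshold, which is a conservative choice of this sketch.

```python
INF = float("inf")


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def build_pyramid(hist):
    """Return [X^0, ..., X^L]; assumes len(hist) is a power of two."""
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    return levels[::-1]


def msa_s(q, pyramids, d_min=INF):
    """Returns (best_index_or_None, d_min, last), where last[i] is the
    last distance computed for candidate i, i.e. d^{l_i}(X_i, Q)."""
    best, last = None, []
    for i, p in enumerate(pyramids):
        d = 0
        for level in range(1, len(q)):
            d = l1(p[level], q[level])
            if d > d_min:
                break                      # removed at level l_i
        else:
            d_min, best = d, i
        last.append(d)
    return best, d_min, last


def cmsa_s(query, centers, clusters):
    """Returns ((cluster_index, element_index), d_min) of the best match."""
    q = build_pyramid(query)
    center_pyr = [build_pyramid(c) for c in centers]
    cluster_pyr = [[build_pyramid(x) for x in m] for m in clusters]
    delta = [max(l1(x, c) for x in m) for c, m in zip(centers, clusters)]
    # Step 1: nearest center; bound[k] = d^{l_k}(C_k, Q) comes for free.
    k_min, _, bound = msa_s(q, center_pyr)
    # Step 2: initial d_min from the nearest cluster.
    i, d_min, _ = msa_s(q, cluster_pyr[k_min])
    best = (k_min, i)
    # Step 3: Inequality Property 1.1 decides which clusters are searched.
    for k in range(len(clusters)):
        if k == k_min or bound[k] - delta[k] > d_min:
            continue
        i, d_new, _ = msa_s(q, cluster_pyr[k], d_min)
        if i is not None:
            d_min, best = d_new, (k, i)
    return best, d_min
```

The list `bound` reuses the partially computed center distances, so no distance to a cluster center is recalculated in the removal test.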

In accordance with CMSA_M, $C_{k_{\min}}$ is first determined in
the same fashion as that in CMSA_S. Thereafter, a minimum
distance arrangement shown in Fig. 4 and adapted to store
respective distance values of $M$ more-significant best matches
is filled with those best matches in accordance with a rule to
be described hereinafter.

Fig. 4 is a diagram schematically illustrating a minimum
distance arrangement of $M$ more-significant best matches.

If $n(\Phi_{k_{\min}}) \ge M$, the $M$ more-significant best
matches are filled in the arrangement in ascending order,
starting from that having the lowest value.

If $n(\Phi_{k_{\min}}) < M$, respective distances of all elements
present in $\Phi_{k_{\min}}$ are calculated. The calculated values
are stored in the arrangement in ascending order, starting from
the lowest value. The remaining elements of the arrangement, not
filled with the calculated values, are set to the infinite
value. Using a modified MSA_S, it is possible to
determine the $M$ more-significant best matches present in
$\Phi_{k_{\min}}$. This modified algorithm is referred to as
"MSA_M". This MSA_M can be summarized as follows:

Step 1) The multi-resolution features of the query $Q$ are
derived.

Step 2) All elements present in $d_{\min}[\cdot]$ are initialized
with the infinite value.

Step 3) $i$ and $l$ are set to 1, respectively.

Step 4) If $l = L$, the procedure proceeds to step 6. If
$i > n(\Phi_{k_{\min}})$, the procedure proceeds to step 7.

Step 5) $d^l(X_i, Q)$ is calculated. If
$d^l(X_i, Q) > d_{\min}[M - 1]$, the current candidate $X_i$ is
removed. Thereafter, $i$ and $l$ are updated with $i + 1$ and 1,
respectively. Following this updating, the procedure returns to
step 4. If $d^l(X_i, Q) \le d_{\min}[M - 1]$, $l$ is updated with
$l + 1$. Following this updating, the procedure returns to step 4.

Step 6) If $d^L(X_i, Q) > d_{\min}[M - 1]$, the current candidate
$X_i$ is removed. If not, $d_{\min}[M - 1]$ is updated with
$d^L(X_i, Q)$. Following this updating, $d_{\min}[\cdot]$ is
arranged in ascending order, starting from the lowest value.
Thereafter, $i$ and $l$ are updated with $i + 1$ and 1,
respectively. Following this updating, the procedure returns to
step 4.

Step 7) Finally, the $M$ data left in $d_{\min}[\cdot]$ are
selected as the $M$ more-significant best matches.
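
The MSA_M steps may be sketched as follows, with $d_{\min}[\cdot]$ kept as an ascending list of (distance, index) pairs. The names are hypothetical, the placeholder index -1 marks entries still holding the infinite value, and at least two bins are assumed.

```python
INF = float("inf")


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def build_pyramid(hist):
    """Return [X^0, ..., X^L]; assumes len(hist) is a power of two."""
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    return levels[::-1]


def msa_m(query, pyramids, m):
    """Return the ascending list d_min[0..M-1] of (distance, index)."""
    q = build_pyramid(query)                  # Step 1
    if len(q) < 2:
        raise ValueError("at least two bins are assumed")
    top = [(INF, -1)] * m                     # Step 2
    for i, p in enumerate(pyramids):
        for level in range(1, len(q)):        # Steps 4-6
            d = l1(p[level], q[level])
            if d > top[-1][0]:                # compare with d_min[M-1]
                break                         # candidate removed
        else:
            top[-1] = (d, i)                  # replace the worst entry
            top.sort()                        # keep ascending order
    return top                                # Step 7


db = [[4, 0, 0, 0], [0, 4, 0, 0], [0, 0, 4, 0], [1, 1, 1, 1]]
print(msa_m([0, 0, 3, 1], [build_pyramid(h) for h in db], 2))
# [(2, 2), (4, 3)]
```

When fewer than $M$ candidates are available, the unused entries simply keep the infinite value, which matches the filling rule described for the minimum distance arrangement.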

As mentioned above, $d_{\min}[\cdot]$ is first filled by applying
MSA_M to $\Phi_{k_{\min}}$, and is then updated by applying MSA_M
to each of the clusters selected from the remaining clusters in
accordance with Inequality Property 1.1.

Finally, data corresponding to $d_{\min}[\cdot]$ are selected as
the $M$ more-significant best matches. Even with the
above-mentioned search algorithm, however, the $M$
more-significant best matches may not always be found
accurately.

Fig. 5 is a schematic diagram illustrating an example in 
which an erroneous determination for best matches is made. 

Although $X_8$, $X_4$, and $X_2$ are selected as three
more-significant best matches in the case of Fig. 5, the actual
third best match is not $X_2$, but $X_9$. To address this, a
cluster removal condition relaxed from Inequality Property 1.1
may be induced by replacing $d_{\min}$ with $d_{\min}[M - 1]$.
That is, the following Inequality Property 1.2 may be induced:



[Expression 19]

If $d(C_k, Q) - \delta_k > d_{\min}[M - 1]$, then
$\min_{X_i \in \Phi_k} d(X_i, Q) > d_{\min}[M - 1]$.

When the above inequality property is employed as a
post-processing step in the algorithm proposed in accordance
with the present invention, it is always possible to accurately
search for the $M$ more-significant best matches. The final CMSA_M
using the above mentioned inequality properties can be
summarized as follows:

Step 1) The cluster $k_{\min}$ having the minimum distance
$d'_{\min}$ is searched for, as in step 1 of CMSA_S.

Step 2) If $n(\Phi_{k_{\min}}) \ge M$, the $M$ more-significant
best matches are searched for in accordance with MSA_M. Respective
distance values of the searched more-significant best matches are
stored in $d_{\min}[\cdot]$. On the other hand, if
$n(\Phi_{k_{\min}}) < M$, the $n(\Phi_{k_{\min}})$ distance values
are filled in $d_{\min}[\cdot]$ in ascending order, starting
from the lowest value. The remaining elements of the
arrangement are set to the infinite value.

Step 3-1) $k$ is set to 1.

Step 3-2) If $k = k_{\min}$, $k$ is updated with $k + 1$. On the
other hand, if $k > K$, the procedure proceeds to step 3-5.

Step 3-3) If $d^{l_k}(C_k, Q) - \delta_k > d_{\min}[0]$, the
cluster $k$ is removed. After $k$ is updated with $k + 1$, the
procedure returns to step 3-2.

Step 3-4) If $d^{l_k}(C_k, Q) - \delta_k \le d_{\min}[0]$,
$d_{\min}[\cdot]$ is updated by applying MSA_M to $\Phi_k$. After
$k$ is updated with $k + 1$, the procedure returns to step 3-2.

Step 3-5) $k$ is set to 1.

Step 3-6) If it is determined that the cluster $k$ has been
searched at step 3-4, $k$ is then updated with $k + 1$. If
$k > K$, the procedure proceeds to step 3-9.

Step 3-7) If $d^{l_k}(C_k, Q) - \delta_k > d_{\min}[M - 1]$, the
cluster $k$ is removed. After $k$ is updated with $k + 1$, the
procedure returns to step 3-6.

Step 3-8) If $d^{l_k}(C_k, Q) - \delta_k \le d_{\min}[M - 1]$,
$d_{\min}[\cdot]$ is updated by applying MSA_M to $\Phi_k$. After
$k$ is updated with $k + 1$, the procedure returns to step 3-6.

Step 3-9) The $M$ data corresponding to the final
$d_{\min}[\cdot]$ are selected as the top $M$ best matches.
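
The two-pass structure of CMSA_M may be sketched as follows. This is an illustrative sketch under stated assumptions: all names are hypothetical, pyramids and $\delta_k$ are computed inside the function although the text stores them in advance, at least two bins are assumed, and the second pass skips the nearest cluster as well as the clusters searched in the first pass.

```python
INF = float("inf")


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def build_pyramid(hist):
    """Return [X^0, ..., X^L]; assumes len(hist) is a power of two."""
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        levels.append([fine[2 * m] + fine[2 * m + 1]
                       for m in range(len(fine) // 2)])
    return levels[::-1]


def msa_m(q, pyramids, top, k):
    """Update the ascending list top (d_min[.]) in place with cluster k."""
    for i, p in enumerate(pyramids):
        for level in range(1, len(q)):
            d = l1(p[level], q[level])
            if d > top[-1][0]:
                break
        else:
            top[-1] = (d, (k, i))
            top.sort()


def nearest_center(q, center_pyr):
    """MSA_S over the centers; bound[k] is d^{l_k}(C_k, Q)."""
    d_min, k_min, bound = INF, None, []
    for k, p in enumerate(center_pyr):
        d = 0
        for level in range(1, len(q)):
            d = l1(p[level], q[level])
            if d > d_min:
                break
        else:
            d_min, k_min = d, k
        bound.append(d)
    return k_min, bound


def cmsa_m(query, centers, clusters, m):
    """Return the ascending list of (distance, (cluster, element))."""
    q = build_pyramid(query)
    center_pyr = [build_pyramid(c) for c in centers]
    cluster_pyr = [[build_pyramid(x) for x in c] for c in clusters]
    delta = [max(l1(x, c) for x in mem) for c, mem in zip(centers, clusters)]
    k_min, bound = nearest_center(q, center_pyr)          # Step 1
    top = [(INF, (-1, -1))] * m
    msa_m(q, cluster_pyr[k_min], top, k_min)              # Step 2
    searched = {k_min}
    # Steps 3-1 to 3-4 prune against d_min[0]; steps 3-5 to 3-8 repeat
    # the scan against d_min[M-1] for the clusters not yet searched.
    for threshold_index in (0, -1):
        for k in range(len(clusters)):
            if k in searched:
                continue
            if bound[k] - delta[k] <= top[threshold_index][0]:
                msa_m(q, cluster_pyr[k], top, k)
                searched.add(k)
    return top                                            # Step 3-9
```

Because `msa_m` starts from entries holding the infinite value, the same call covers both cases of Step 2, whether or not the nearest cluster contains at least $M$ elements.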

As apparent from the above description, the following
advantages are obtained in accordance with the optimal
high-speed multi-resolution retrieval method on a large capacity
database proposed by the present invention.

First, the method of the present invention can be used as
a core module of a search engine for any system used for
a high-speed optimal retrieval on a large capacity database,
for example, an image or video database.

Second, the method of the present invention is
applicable to any multimedia database whose data, such as image
or audio data, can be given a multi-resolution structure,
to accurately and rapidly search the database for desired
information.



